Abstract

The first self-stabilizing algorithm [1] assumed the existence of a central daemon that
activates one processor at a time to change state as a function of its own state and the
state of a neighbor. Subsequent research has reconsidered this algorithm without the
assumption of a central daemon, and under different forms of communication, such as the
model of link registers. In all of these investigations, one common feature is the atomicity of
communication, whether by shared variables or read/write registers. This paper weakens
the atomicity assumptions for the communication model, proposing versions of [1] that
tolerate various weaker forms of atomicity. First, a solution for the case of regular registers
is presented. Then the case of safe registers is considered, with both negative and positive
results presented. The paper also presents an implementation of [1] based on registers that
have probabilistically correct behavior, which requires a notion of weak stabilization.

1 Introduction 

The self-stabilization concept is not tied to particular system settings. Our work considers several
new system settings and demonstrates the applicability of the self-stabilization paradigm
to these systems. In particular, we investigate systems with regular and safe registers and
present modifications of Dijkstra's first self-stabilizing algorithm [1] that stabilize in these
systems.

The solution for the regular registers case uses a special label in between writes of labels.
In the case of safe registers we prove impossibility results for the cases in which neighboring
processors use a single safe register to communicate between themselves, whether or not the
register is divided into multiple fields. On the positive side, we define a composite safe register that,
roughly speaking, ensures that reads return at most one corrupted field, and we design an algorithm
for that case. Then we allow the processors to read the value written in their registers (therefore
avoiding extra writes for refreshes). We present two algorithms for the above case, one that
uses unary encoding and another that is based on Gray code.

Then we introduce randomized registers that, roughly speaking, return the "correct value" 
with probability p. It is impossible to ensure closure in such a system, since all reads may 
return incorrect values. We introduce the notion of weak self-stabilization for such systems. 
We use Markov chains to compute the ratio between the number of safe configurations and 
unsafe configurations in an infinite execution. 

Markov chains associate each state (system configuration) with the probability of being in this
state during an infinite execution. The fixed probability of the state is a "stabilizing" value.
It is clear that the probability is either zero or one in the first configuration. Given the
probability of transitions between configurations, one can compute the stable probability in
an infinite execution, which is typically greater than zero and less than one. We found the
definition of weak stabilization and the use of Markov chains to be an interesting and promising
way to extend the applicability of the self-stabilizing concept.

The remainder of the paper is organized as follows. In the next section we describe a
solution for regular registers. Then in Section 3 we present impossibility results and algorithms
for different settings of systems that use safe registers. Randomized registers and the use of
Markov chains are presented in Section 4. Detailed proofs are omitted from this extended
abstract.

2 Regular Registers 

Before we introduce our results for the case of regular registers, let us present "folklore" results
concerning read/write registers.

Read/Write Atomicity: It is known that $n - 1$ labels are sufficient for the convergence of
Dijkstra's algorithm assuming a central daemon, where $n$ is the number of processors in the
ring. We next prove that $n - 2$ labels are not sufficient.

Lower bound: Consider the case of $n - 2$ states in a system of $n = 5$ processors. Thus
there are three possible processor states, which we label {0,1,2}. To prove impossibility we
demonstrate a non-converging sequence of transitions (the key to constructing the sequence is
to maintain all three types of labels in each system state, which violates the key assumption
for the proof of convergence).

{0,0,2,1,0} → {1,0,2,1,0} → {1,0,2,1,1} → {1,0,2,2,1} → {1,0,0,2,1} → {1,1,0,2,1}.
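The sequence above can be replayed mechanically. The following Python sketch is only an illustration: the function name `step` and the activation order (p_1, p_5, p_4, p_3, p_2) are inferred from the listed configurations and are not given in the text. It checks that every configuration keeps all three labels and that the last configuration is the first one with every label incremented modulo 3, so the same schedule can be repeated indefinitely.

```python
def step(cfg, i, k):
    """Activate processor i (0-based) under a central daemon with k labels.

    Processor 0 plays the role of p_1: it increments when its label equals
    the label of the last processor. Every other processor copies the label
    of its predecessor when the two differ.
    """
    cfg = list(cfg)
    if i == 0:
        if cfg[0] == cfg[-1]:
            cfg[0] = (cfg[0] + 1) % k
    elif cfg[i] != cfg[i - 1]:
        cfg[i] = cfg[i - 1]
    return tuple(cfg)


K = 3                          # n - 2 labels for n = 5 processors
start = (0, 0, 2, 1, 0)
schedule = [0, 4, 3, 2, 1]     # assumed order: p_1, p_5, p_4, p_3, p_2

trace = [start]
for processor in schedule:
    trace.append(step(trace[-1], processor, K))

for cfg in trace:
    print(cfg)
```

Because the final configuration is a relabeling of the initial one, the daemon can apply the same schedule again and the system never reaches a configuration with a single privilege.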

We now present a reduction (see [|]]) of a ring with 2n processors that is activated by a 
central daemon to a ring with $n$ processors that assumes read/write atomicity. We conclude
that at least $2n - 1$ states are required.

Each processor $p_j$ has an internal variable in which $p_j$ stores the value $p_j$ reads from $p_{j-1}$.
Each read is a copy to an internal variable and each write is a copy of an internal variable to a
register. Thus, we have in fact a ring of $2n$ processors in a system with a central daemon.
Hence, $2n - 1$ states are required and are sufficient.

We now turn to design an algorithm for the case of regular registers. Informally, a regular 
register has the property that a read operation concurrent with a write operation can return 
either the "old" or "new" value. More formally, to define a regular register $r$ we need to define
the possible values that a read operation from $r$ returns. Let $x^0$ be the value of the last write
operation in $r$ that ends prior to the beginning of the read operation (let $x^0$ be the initial value
of $r$ if no such write exists).

A read operation from a regular register $r$ that is not executed concurrently with a write
operation to $r$ returns $x^0$. A read operation from a regular register $r$ that is executed
concurrently with a write of a value $x^1$ returns either $x^0$ or $x^1$. Note that more generally, a read
concurrent with a sequence of write operations of the values $x^1, x^2, \ldots$ to $r$ could return any
$x^k$; however, once a read returns $x^k$ for $k > 1$, no subsequent read by the same reader will
return $x^j$ for $j < k - 1$.

A naive implementation of Dijkstra's algorithm using regular registers may result in the 
following execution: 
Figure 1: Straightforward regular register implementation fails



We have started in a safe configuration in which all the values (in the registers and the
internal variables) are 0, and we have reached a configuration in which all the processors may
simultaneously change state.

To overcome the above difficulty we introduce a new value $\bot$ that is written before any
change of a value of a register. The algorithm for the case of regular registers appears in
Figure 2. In the figure, $IR_i$ is the input register for $p_i$ (thus $IR_i$ is the output register of $p_{i-1}$).
Variable $x_i$ contains the counter defined for Dijkstra's algorithm, and variable $t_i$ is introduced
to emphasize the fine-grained atomicity of the model (one step reads a register, and the value
it returns is tested in another step).

A safe configuration is a configuration in which all the registers have the same value, say
$x$, and every read operation that has already started will return $x$. For simplicity we assume
there are $2n + 1$ states. Therefore, it is clear that a state is missing in the initial configuration,
say the state $y$. Hence, when $p_1$ writes $y$, $p_1$ does not change its state before reading it from $p_n$.
$p_n$ can read $y$ only when $p_{n-1}$ has the state $y$. Any read operation of $p_{n-1}$ that starts following
the write operation that assigns $y$ to $p_{n-1}$ may return either $\bot$ or $y$, which is effectively $y$ (see
lines 3 to 6 and 10 to 13 of the code).



 1  p_1:          do forever
 2                  read t_1 := IR_1
 3                  if t_1 = x_1 then
 4                    x_1 := (x_1 + 1) mod (2n + 1)
 5                    write IR_2 := ⊥
 6                    write IR_2 := x_1
 7                  else write IR_2 := x_1
 8  p_i (i ≠ 1):  do forever
 9                  read t_i := IR_i
10                  if x_i ≠ t_i then
11                    x_i := t_i
12                    write IR_{i+1} := ⊥
13                    write IR_{i+1} := x_i
14                  else write IR_{i+1} := x_i

Figure 2: Dijkstra's Algorithm for Regular Registers



3 Safe Registers 

Safe registers have the weakest properties of any in Lamport's hierarchy. A read concurrent 
with a write to a safe register can return any value in the register's domain, even if the value 
being written is already equal to what the register contains. There are two cases to consider 
for the model of safe registers. If a processor is unable to read the register(s) that it writes, we 
can show that Dijkstra's algorithm cannot be implemented. We initially consider the model 
of a single link register for each processor under the restriction that a writer is unable to read 
its output registers. 

Lemma 3.1 Dijkstra's algorithm cannot be implemented using only a single 1W1R safe register
between $p_i$ and $p_{i+1}$.

Proof: Processor $p_i$ ($i \neq 1$) that copies from the output register of $p_{i-1}$ must continually
rewrite its output register for $p_{i+1}$; otherwise there can be a deadlock where the value
written by $p_i$ is different from the value $p_i$ reads from $p_{i-1}$. Similarly, $p_1$ must repeatedly
write, otherwise there can be a deadlock where all the registers have the same value, and the
program counter of $p_1$ is past the first write to its register that incremented this value. Therefore,
processors continually write into their output registers. Since all processors repeatedly write
their output registers, we can construct an execution where reads are concurrent with writes
and obtain arbitrary values. This construction can be used to show that the protocol does not
converge (and also that it is not stable). ■

Multiple fields safe register: The next result we present is impossibility for the case of
multiple safe registers per processor, but where processors cannot read the registers they write.
Suppose each processor $p_i$ has $m$ safe registers to write, which $p_{i+1}$ reads, and also $p_i$ reads $m$
safe registers written by $p_{i-1}$. If a protocol allows a state in which a processor does not write any
of its registers so long as its state does not change, then we may construct a deadlock because
the local state of the processor differs from the encoding of values contained in its output
registers. Therefore, in any implementation of the protocol, we can construct an execution
fragment so that any chosen processor $p_i$ writes at least some of its registers $t$ times, for
arbitrary $t > 0$, and during the same execution fragment, $p_{i-1}$ takes no steps. Moreover, if
$p_i$ does not write to all $m$ registers, then the registers it does not write can have arbitrary
values inherited from the initial state. Therefore, $p_{i+1}$ can read any value from $p_i$, since at
each step of $p_{i+1}$ reading one of the $m$ registers written by $p_i$, we can construct an execution
in which $p_i$ is concurrently writing to the same safe register. Because $p_{i+1}$ can read any value,
it is possible for $i \neq n$ that $p_{i+1}$ reads a value equal to its own current value, which for
Dijkstra's algorithm means that $p_{i+1}$ will maintain its current value rather than changing it;
for the case $i = n$, there is an execution where each time $p_1$ reads its input registers, the value
read differs from its own value, and again $p_1$ makes no change to its current value. These
situations can repeat indefinitely with no processor entering the critical section.

Composite safe register: Next we sketch a solution in which fields of the registers can be
written and the entire register is read at once. We call such a register a composite safe register.
A read from a composite safe register may return an arbitrary value for at most one of the
register fields, a field in which a write is executed concurrently with the read (see footnote 1). We note that
there is a natural extension of our algorithm in which at most $k$ fields of a register may return
an arbitrary value.

Each bit of the label value is stored in three 1-bit safe registers (three fields). This will 
ensure that a read during a refresh operation will return the value of the register. Assume 
that the value 101 is stored in nine 1-bit safe registers as 111000111. Assume further that a 
processor refreshes the value written in these registers each time writing in one of the 1-bit 
safe registers. A read operation returns the value of the entire composite safe register in which 
at most one bit is wrong. The Hamming distance ensures that the original value of the label 
bit can be determined. 
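The redundancy scheme above can be illustrated with a short Python sketch. The helper names `encode`, `decode`, and `flip` are hypothetical and chosen only for this example. The sketch assumes, as in the text, that a read returns at most one wrong bit over the whole composite register, and it recovers each label bit by majority vote over its three fields.

```python
def encode(label_bits: str) -> str:
    """Store every label bit in three 1-bit fields."""
    return "".join(bit * 3 for bit in label_bits)


def decode(fields: str) -> str:
    """Recover each label bit by majority vote over its three fields."""
    triples = [fields[i:i + 3] for i in range(0, len(fields), 3)]
    return "".join("1" if t.count("1") >= 2 else "0" for t in triples)


def flip(fields: str, position: int) -> str:
    """Model a read that returns one wrong bit at the given position."""
    wrong = "1" if fields[position] == "0" else "0"
    return fields[:position] + wrong + fields[position + 1:]


stored = encode("101")
# Decode every possible read in which exactly one field is wrong.
recovered = {decode(flip(stored, pos)) for pos in range(len(stored))}
print(stored, recovered)
```

Any two codewords differ in at least three positions, so a single wrong field always leaves the read closer to the stored codeword than to any other.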

Footnote 1: This assumption reflects reality in systems in which a read operation is much faster than a write operation.

To allow a value change we add a three-bit guard value. Hence, the composite safe register
has three bits that function as a guard value and 3 × 2(n + 1) bits for the label.

A processor $p_i$, $i \neq 1$, that reads a new value from $p_{i-1}$ first sets the guard value to 0
(writing 000 in the guard bits), and then changes the value of the label. $p_i$ writes 111 to its
guard bits once it finishes updating the label.

A processor $p_i$ that reads a guard value 0 does not use the value read. When $p_i$ reads a
guard value 1 it examines the value it read.

The correctness proof starts by establishing that after the first time a processor $p_i$
refreshes (or writes a new value in) its register, any read operation from its register (that returns
a value) results in the last value written to this register. $p_1$ eventually writes a label that does
not exist elsewhere in the system, and this label cleans the system. More details are omitted from this extended abstract.

Safe registers with reads instead of refreshes: Given the above impossibility results, we
examine settings where a processor can read the contents of the registers in which it writes.
Consider $2n + 1$ single-bit, safe, 1W2R registers rather than a single register per processor.
Each processor maintains a counter with domain $[1, 2n + 1]$ for Dijkstra's algorithm. Unary
encoding represents this counter: for a counter value $k$, the proper encoding is to write 0 to all
registers except for the register with index $k$, which has value 1.
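As an illustration of the unary encoding just described, the following Python sketch encodes a counter into a vector of single-bit registers and decodes a vector back into a counter. The function names are hypothetical, and registers are numbered from 1 as in the text. Decoding rejects any vector whose number of 1 bits is not exactly one.

```python
from typing import List, Optional


def encode_unary(k: int, n: int) -> List[int]:
    """Return the 2n + 1 register bits for counter value k in [1, 2n + 1]."""
    if not 1 <= k <= 2 * n + 1:
        raise ValueError("counter out of range")
    return [1 if index == k else 0 for index in range(1, 2 * n + 2)]


def decode_unary(registers: List[int]) -> Optional[int]:
    """Return the encoded counter, or None when the vector is not a valid encoding."""
    ones = [index for index, bit in enumerate(registers, start=1) if bit == 1]
    return ones[0] if len(ones) == 1 else None


n = 2
vector = encode_unary(3, n)
print(vector, decode_unary(vector))
```

A reader that obtains an all-zero vector, or a vector with two or more 1 bits, learns nothing and simply retries on its next cycle.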



 1  p_1:          do forever
 2                  do k := 1 to 2n + 1
 3                    if k ≠ x_1 ∧ IR_2[k] = 1
 4                      write IR_2[k] := 0
 5                    if k = x_1 ∧ IR_2[k] = 0
 6                      write IR_2[k] := 1
 7                  do s := 0, j := 0, k := 1 to 2n + 1
 8                    if IR_1[k] = 1
 9                      s := s + 1; j := k
10                  if s = 1 ∧ j = x_1
11                    x_1 := 1 + x_1 mod (2n + 1)

 1  p_i (i ≠ 1):  do forever
 2-9                (same as for p_1)
10                  if s = 1 ∧ j ≠ x_i
11                    x_i := j

Figure 3: Dijkstra's Algorithm for Safe Registers
A legitimate configuration for this protocol is that each register vector represents the processor's
last counter value (it differs only when a processor updates its counter) and counters
correspond to Dijkstra's algorithm.

Lemma 3.2 Figure 3 is a self-stabilizing implementation of Dijkstra's algorithm.

Proof: There are two proof obligations, stability (closure) from legitimate configurations and 
convergence from arbitrary configurations to legitimate ones. 

Closure. It is straightforward to verify that in any processor cycle from a legitimate
configuration, a processor writes to at most two registers as it changes the counter value. Thus
when the neighbor reads these registers, at most two reads can have incorrect values due to
concurrent writing. If both have correct values, the token passes correctly (a subsequent read
by the process can still obtain an incorrect value, but only by getting 0 for all reads, which
causes no harm). If both have incorrect values, then the reader observes no change in counter
values. If just one returns an incorrect value, then the reader observes parity of zero, which is
harmless. This reasoning shows that the protocol is stable.
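The case analysis in the closure argument can be enumerated directly. The Python sketch below is a simplified model, not the paper's proof: it assumes that only the two registers being rewritten (the old index and the new index) may return an arbitrary bit during a scan, and that every other register returns 0. The function name `observed_values` is hypothetical.

```python
from itertools import product


def observed_values(old: int, new: int, n: int) -> set:
    """Counter values a reader may decode while the writer moves from old to new.

    Registers are indexed 1..2n + 1. A decoded value of None stands for a
    scan whose number of 1 bits differs from one, which the reader ignores.
    """
    outcomes = set()
    for bit_old, bit_new in product((0, 1), repeat=2):
        registers = [0] * (2 * n + 1)
        registers[old - 1] = bit_old   # arbitrary bit from the register being cleared
        registers[new - 1] = bit_new   # arbitrary bit from the register being set
        ones = [i + 1 for i, bit in enumerate(registers) if bit == 1]
        outcomes.add(ones[0] if len(ones) == 1 else None)
    return outcomes


print(observed_values(old=2, new=3, n=2))
```

Under these assumptions the reader decodes the old counter, the new counter, or an invalid vector, and never a third counter value.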

Convergence. The remaining task is to verify that the protocol guarantees to reach a legitimate
configuration in any execution. Suppose all processors have completed at least one cycle
of statements 1-11. In the subsequent execution, a processor only writes a register if that register
requires change to agree with the processor's counter. Note that by standard arguments,
no deadlock is possible in this system and that $p_1$ increments its counter infinitely many times
in an execution. It is still possible that one processor can read more than two incorrect values
due to concurrent writes (consider an initial state with many counter values; as these values
are propagated to some $p_i$, it could be that $p_{i+1}$ happens to read many registers concurrent
with $p_i$ writing to them). Since the counter range is $[1, 2n + 1]$ and there are $n$ processors, it
follows that at least one counter value $t$ is not present in the system. By the arguments given
for the proof of closure, no processor incorrectly reads input registers to get the value $t$ in
such a configuration. Because $p_1$ increments $x_1$ infinitely, we can suppose $x_1 = t$ but no other
processor or register encoding equals $t$, and by standard arguments (and the propagation of
values observed in the proof of closure), a legitimate configuration eventually is reached. ■

The protocol of Figure 3 uses an expensive encoding of counter values, requiring $2n + 1$
separate registers. The argument for closure shows that changing a counter and transmitting 
it is effectively an atomic transfer of the value — once the new value is observed, then any 
subsequent read of the registers either returns the new value or some invalid value (where the 
sum of bits does not equal 1), which is ignored. Note that this technique is not a general 
implementation of an atomic register from safe registers; it is specific to the implementation 
of Dijkstra's algorithm. 

Can we do better than using $2n + 1$ registers? The following protocol uses the Gray code
representation of the counter, plus an extra bit for parity. The number of registers per processor
is $m + 1$ where $m = \lceil \lg(2n + 1) \rceil$.
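The register count and the one-bit-change property that the Gray code encoding relies on can be checked with a small Python sketch. It assumes the standard reflected binary Gray code `x ^ (x >> 1)`; the text does not fix a particular Gray code at this point, so that choice and all function names are illustrative. The check covers increments from x to x + 1 only and does not examine the wraparound from 2n back to 0.

```python
def gray(x: int) -> int:
    """Reflected binary Gray code of x (assumed encoding)."""
    return x ^ (x >> 1)


def parity(g: int) -> int:
    """Parity of the number of 1 bits in g."""
    return bin(g).count("1") % 2


def registers_per_processor(n: int) -> int:
    """m + 1 registers, where m = ceil(lg(2n + 1))."""
    m = (2 * n).bit_length()   # equals ceil(lg(2n + 1)) for n >= 1
    return m + 1


n = 3
print(registers_per_processor(n), "registers instead of", 2 * n + 1)
for x in range(2 * n):
    changed = gray(x) ^ gray(x + 1)
    # Exactly one Gray code bit changes, so the parity bit changes as well.
    print(x, format(gray(x), "03b"), parity(gray(x)), bin(changed).count("1"))
```

For n = 3 this gives 4 single-bit registers per processor instead of the 7 required by the unary encoding.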



 1  p_1:          do forever
 2                  do k := 1 to m
 3                    if IR_2[k] ≠ Graycode(x_1)[k]
 4                      write IR_2[k] := Graycode(x_1)[k]
 5                  if IR_2[m + 1] ≠ parity(Graycode(x_1))
 6                    write IR_2[m + 1] := parity(Graycode(x_1))
 7                  do k := 1 to m
 8                    g[k] := IR_1[k]
 9                  if IR_1[m + 1] = parity(g) ∧ Graycode(x_1) = g
10                    x_1 := (x_1 + 1) mod (2n + 1)

 1  p_i (i ≠ 1):  do forever
 2-8                (same as for p_1)
10                  if IR_i[m + 1] = parity(g) ∧ Graycode(x_i) ≠ g
11                    x_i := g

Figure 4: Dijkstra's Algorithm using Gray Code



Lemma 3.3 Figure 4 is a self-stabilizing implementation of Dijkstra's algorithm.



Proof: The closure argument is the same as given in the proof of Lemma 3.2, inspecting
each of the four cases of reading overlapping with writing of the two bits that change when a 
processor changes its counter and writes the one new Gray code bit and the parity bit. In each 
case, the neighbor processor either reads the old value, or ignores the values it reads (because 
parity is incorrect), or obtains the new counter value. The change from old to new counter 
value is essentially atomic. 

Proof of convergence requires new arguments. Consider some configuration of an execution
prior to which each processor has completed at least two cycles of statements 1-11 in Figure 4, so
that output registers agree with counter values (unless the processor has read a new value and
updated its counter). Observe that thereafter, if processor $p_i$ successively reads two different
Gray code values from its input registers, each with correct parity, then $p_{i-1}$ concurrently
wrote at least once to its output registers. Moreover, if $p_i$ successively reads $k$ different Gray
code values with correct parity, then $p_{i-1}$ wrote at least $k$ times a new counter value and read
at least $k - 1$ times from its own input registers, by the structure of the loop (statements 1-11)
in Figure 4. A consequence of these observations is that if $p_1$ successively reads $k$ different
counter values with correct parity, then $p_{n-k}$ wrote at least one new counter value in the same
period. In particular, if $p_1$ successively reads $n + 2$ different counter values, then we may
assert that $p_2$ read the output registers of $p_1$ and wrote a new counter in the same period. By the
standard argument refuting deadlock, processor $p_1$ increments its counter infinitely often in
any execution. Therefore we can consider an execution suffix starting with $x_1 = 0$. In the
typical reflected Gray code, the high-order bit starting from $x_1 = 0$ does not change until the
counter has incremented $2^m$ times. Therefore, until $p_1$ has incremented $x_1$ at least $2^m$ times,
any read by $p_2$ obtains a value with zero in the high-order bit. The observations above imply
that, before $x_1$ changes at the high-order bit, each processor has copied some counter value
obtained via $p_1$; such counter values may be inaccurate due to reads overlapping writes or
more than one write (bit change) for one scan of a set of registers, but the value for the
high-order bit stabilizes to zero in this execution fragment. In a configuration where no counter
or register set has 1 in the high-order bit, the event of $p_1$ changing the high-order bit creates
a unique occurrence of 1 in that position. Since $p_1$ does not again change its counter until
observing the same value from $p_n$, convergence is guaranteed. ■



4 Randomized State Reads and Weak Stabilization 

Consider a system with a fair central daemon: in any given configuration the daemon activates
each of the processors with equal probability. A system is weakly stabilizing if, in any execution, 
the probability that the system remains in any set of illegitimate configurations is zero. This 
definition implies that a weakly stabilizing system has the property that its state is infinitely 
often legitimate. In addition, one can sum up the probabilities for being in a legitimate state 
and use this value to compare algorithms. 

To apply the definition of weak stabilization, we model register behavior probabilistically: 
a processor that makes a transition may "read" an incorrect value and therefore make an 
errant transition. We use Markov chains to analyze the percentage of the execution in which 
the system will not be in a safe configuration. See Q for a description of Markov chains. 

We continue describing our approach using a system of three processors and two states. 
The transitions and probabilities of the system appear in Figure 5.

A read of a neighboring state returns with probability p > 1/2 the correct value. Each 
configuration has four outgoing arrows, one arrow for each state change of a processor, and one 
for staying in the same state. There are two possibilities for a state transition of a processor, 
one when the read returns the correct value (probability p) and one when the read returns 
a wrong value (probability 1 — p) . Since the daemon chooses to activate each processor with 
equal probability, we have to use a factor 1/3 for the above probabilities. 

We now choose specific values for $p$ and compute powers of the probability matrix $V$, such
that the matrix in power $i$ and $i + 1$ are equal ($V^i = V^{i+1}$). Then we conclude the percentage
of being in a legal configuration (not in the configurations 010 or 101). The following table
shows different values for $p$ (1, 3/4, 1/2, 1/4) and the corresponding equilibrium vector $\xi^T$;
two figures display the transition matrix $V$ for the cases of $p = 1$ and $p = 3/4$.
Matrix      p      Equilibrium Vector
Figure 6    1      [1/6, 1/6, 0, 1/6, 1/6, 0, 1/6, 1/6]
Figure 7    3/4    [3/20, 3/20, 1/20, 3/20, 3/20, 1/20, 3/20, 3/20]
            1/2    [1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8]
            1/4    [1/12, 1/12, 1/4, 1/12, 1/12, 1/4, 1/12, 1/12]



The vectors show that the equilibrium probability for illegitimate configurations is zero for 
the deterministic case, then increasing as p reduces. Clearly, we can investigate the behavior 
of other systems with a range of probabilities, using the same approach. The results can assist 
us in comparing different system designs. 
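The equilibrium vectors can be reproduced numerically. The Python sketch below is one possible model of the three-processor, two-state system, stated under explicit assumptions that the text does not spell out: configurations are ordered 000, 001, ..., 111 with the state of $p_1$ leftmost, an incorrect read returns the complement of the neighbor's bit, and the equilibrium is approximated by repeated multiplication starting from the uniform vector. All function names are hypothetical.

```python
from itertools import product


def privileged(cfg, i):
    """Dijkstra's rule: p_1 (index 0) compares with the last processor, others with their predecessor."""
    if i == 0:
        return cfg[0] == cfg[-1]
    return cfg[i] != cfg[i - 1]


def transition_matrix(p, n=3):
    """Row-stochastic matrix over the 2^n configurations for read accuracy p."""
    states = list(product((0, 1), repeat=n))
    index = {s: k for k, s in enumerate(states)}
    matrix = [[0.0] * len(states) for _ in states]
    for s in states:
        for i in range(n):
            # With two states, a processor flips its bit exactly when the value
            # it reads makes it look privileged: probability p if it really is
            # privileged, and 1 - p otherwise (assumed error model).
            move = p if privileged(s, i) else 1 - p
            target = list(s)
            target[i] ^= 1
            matrix[index[s]][index[tuple(target)]] += move / n
            matrix[index[s]][index[s]] += (1 - move) / n
    return states, matrix


def equilibrium(matrix, steps=2000):
    """Approximate the equilibrium vector by power iteration from the uniform vector."""
    size = len(matrix)
    v = [1.0 / size] * size
    for _ in range(steps):
        v = [sum(v[a] * matrix[a][b] for a in range(size)) for b in range(size)]
    return v


for p in (1, 0.75, 0.5, 0.25):
    states, matrix = transition_matrix(p)
    print(p, [round(x, 4) for x in equilibrium(matrix)])
```

Under this model the entries for configurations 010 and 101 grow as $p$ decreases, in line with the table.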

Lemma 4.1 Dijkstra's algorithm is weakly stabilizing when register reads are correct with 
probability p > 0. 
Figure 6: Transition Matrix for p = 1

Figure 7: Transition Matrix for p = 3/4 factorized by 3